Statistical Analysis of Regularization Constant - From Bayes, MDL and NIC Points of View

Authors

  • Shun-ichi Amari
  • Noboru Murata
Abstract

In order to avoid overfitting in neural learning, a regularization term is added to the loss function to be minimized. It is naturally derived from the Bayesian standpoint. The present paper studies how to determine the regularization constant from the points of view of the empirical Bayes approach, the minimum description length (MDL) approach, and the network information criterion (NIC) approach. An asymptotic statistical analysis is given to elucidate their differences. These approaches are tightly connected with the method of model selection. The superiority of the NIC is shown from this analysis.
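As a rough illustration of the setting described above (a sketch in my own notation, not the paper's), the regularized loss takes the generic form

\[
  L_\lambda(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(y_i, f(x_i;\theta)\bigr) \;+\; \lambda\, R(\theta),
\]

where \ell is the training loss, R(\theta) the regularization term, and \lambda the regularization constant whose choice is being analyzed. Taking R(\theta) = -\log \pi(\theta) for a prior density \pi makes minimizing L_\lambda equivalent, up to scaling of the two terms, to MAP estimation under \pi, which is the Bayesian derivation the abstract refers to.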


Similar articles

The meaning of place, A constant or changing quality? Lynch,Rapoport and Semiotics view points

The matter of meaning in place is one of the main qualities of human life. People consciously or unconsciously look for meanings in places. The importance of finding the meaning of a place is that understanding the meaning will lead to "acting" in the place: finding the place friendly, or finding it insecure, will lead to acting differently. Now the question is whether the meaning of place is something ...


Characterization of the Bayes estimator and the MDL estimator for exponential families

We analyze the relationship between a Minimum Description Length (MDL) estimator (posterior mode) and a Bayes estimator for exponential families. We show the following results concerning these estimators: a) Both the Bayes estimator with Jeffreys prior and the MDL estimator with the uniform prior with respect to the expectation parameter are nearly equivalent to a bias-corrected maximum-likelih...
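For orientation (this formulation is mine, not quoted from the cited abstract), the two estimators being compared can be written, for a prior density w(\theta) and sample x^n, as

\[
  \hat\theta_{\mathrm{MDL}} = \arg\max_{\theta}\,\bigl[\log p(x^n \mid \theta) + \log w(\theta)\bigr]
  \qquad\text{(posterior mode)},
\]
\[
  \hat\theta_{\mathrm{Bayes}} = \mathrm{E}\bigl[\theta \mid x^n\bigr]
  \qquad\text{(posterior mean, under squared-error loss)},
\]

and the quoted result concerns how close both come to a bias-corrected maximum-likelihood estimate in exponential families.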


Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data

One of the defining properties of deep learning is that models are chosen to have many more parameters than available training data. In light of this capacity for overfitting, it is remarkable that simple algorithms like SGD reliably return solutions with low test error. One roadblock to explaining these phenomena in terms of implicit regularization, structural properties of the solution, and/o...


Minimum Description Length Principle

The minimum description length (MDL) principle states that one should prefer the model that yields the shortest description of the data when the complexity of the model itself is also accounted for. MDL provides a versatile approach to statistical modeling. It is applicable to model selection and regularization. Modern versions of MDL lead to robust methods that are well suited for choosing an ...
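As a brief sketch of the principle just stated (illustrative notation only), the classical two-part form of MDL selects the model

\[
  \hat{M} \;=\; \arg\min_{M \in \mathcal{M}} \bigl[\, L(D \mid M) + L(M) \,\bigr],
\]

where L(D \mid M) is the code length of the data under model M and L(M) is the code length needed to describe the model itself, so that model complexity is penalized alongside data fit.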


Safe Learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity

We extend Bayesian MAP and Minimum Description Length (MDL) learning by testing whether the data can be substantially more compressed by a mixture of the MDL/MAP distribution with another element of the model, and adjusting the learning rate if this is the case. While standard Bayes and MDL can fail to converge if the model is wrong, the resulting “safe” estimator continues to achieve good rate...



Journal:

Volume   Issue

Pages  -

Publication year: 1997